Overview

Dataset statistics

Number of variables: 12
Number of observations: 2591730
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.1 GiB
Average record size in memory: 472.2 B
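
The total size and the average record size above should agree with each other. The following is a minimal sketch of that cross-check, assuming the report uses binary units (1 GiB = 2**30 bytes); the constant and function names are illustrative and not part of the report.

```python
# Figures copied from the Dataset statistics table.
N_OBSERVATIONS = 2_591_730
AVG_RECORD_BYTES = 472.2


def total_size_gib(n_rows: int, avg_record_bytes: float) -> float:
    """Return the implied in-memory size in GiB (assumes 1 GiB = 2**30 bytes)."""
    return n_rows * avg_record_bytes / 2**30


size_gib = total_size_gib(N_OBSERVATIONS, AVG_RECORD_BYTES)
print(f"{size_gib:.1f} GiB")
```

Rounded to one decimal place, the result matches the reported total size.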

Variable types

Categorical: 3
Text: 4
Numeric: 5

Alerts

REF_DATE has constant value "2021" [Constant]
Knowledge of official languages (5):Total - Knowledge of official languages[1] is highly overall correlated with Knowledge of official languages (5):English only[2] and 2 other fields [High correlation]
Knowledge of official languages (5):English only[2] is highly overall correlated with Knowledge of official languages (5):Total - Knowledge of official languages[1] and 2 other fields [High correlation]
Knowledge of official languages (5):French only[3] is highly overall correlated with Knowledge of official languages (5):English and French[4] [High correlation]
Knowledge of official languages (5):English and French[4] is highly overall correlated with Knowledge of official languages (5):Total - Knowledge of official languages[1] and 3 other fields [High correlation]
Knowledge of official languages (5):Neither English nor French[5] is highly overall correlated with Knowledge of official languages (5):Total - Knowledge of official languages[1] and 2 other fields [High correlation]
Knowledge of official languages (5):Total - Knowledge of official languages[1] is highly skewed (γ1 = 270.4104924) [Skewed]
Knowledge of official languages (5):English only[2] is highly skewed (γ1 = 259.6271826) [Skewed]
Knowledge of official languages (5):French only[3] is highly skewed (γ1 = 249.2697714) [Skewed]
Knowledge of official languages (5):English and French[4] is highly skewed (γ1 = 235.2014829) [Skewed]
Knowledge of official languages (5):Neither English nor French[5] is highly skewed (γ1 = 204.0238494) [Skewed]
Gender (3) is uniformly distributed [Uniform]
Age (15A) is uniformly distributed [Uniform]
Coordinate has unique values [Unique]
Knowledge of official languages (5):Total - Knowledge of official languages[1] has 2090181 (80.6%) zeros [Zeros]
Knowledge of official languages (5):English only[2] has 2157937 (83.3%) zeros [Zeros]
Knowledge of official languages (5):French only[3] has 2513182 (97.0%) zeros [Zeros]
Knowledge of official languages (5):English and French[4] has 2361395 (91.1%) zeros [Zeros]
Knowledge of official languages (5):Neither English nor French[5] has 2451876 (94.6%) zeros [Zeros]
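
The percentages in the zeros alerts can be recomputed from the raw counts and the number of observations. The sketch below does that; the shortened column labels and the helper name `zero_share` are illustrative choices, not identifiers from the report.

```python
# Zero counts copied from the alerts; keys are shortened column labels.
N_OBSERVATIONS = 2_591_730
ZERO_COUNTS = {
    "Total": 2_090_181,
    "English only": 2_157_937,
    "French only": 2_513_182,
    "English and French": 2_361_395,
    "Neither English nor French": 2_451_876,
}


def zero_share(zeros: int, n_rows: int) -> float:
    """Return the percentage of rows equal to zero, rounded to one decimal."""
    return round(100 * zeros / n_rows, 1)


zero_shares = {
    name: zero_share(count, N_OBSERVATIONS) for name, count in ZERO_COUNTS.items()
}
for name, share in zero_shares.items():
    print(f"{name}: {share}%")
```

With more than 80% of every numeric column at zero, the very large skewness values in the alerts are unsurprising: a handful of national and provincial totals sit far above a mass of empty cells.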

Reproduction

Analysis started: 2023-10-04 23:51:31.553548
Analysis finished: 2023-10-04 23:52:24.761052
Duration: 53.21 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
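
The reported duration follows directly from the two timestamps. A short sketch using only the standard library, with the timestamps copied from the lines above:

```python
from datetime import datetime

# Timestamps as printed in the Reproduction section.
started = datetime.fromisoformat("2023-10-04 23:51:31.553548")
finished = datetime.fromisoformat("2023-10-04 23:52:24.761052")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```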

Variables

REF_DATE
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 150.8 MiB

2021 2591730

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 10366920
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021
2nd row: 2021
3rd row: 2021
4th row: 2021
5th row: 2021

Common Values

ValueCountFrequency (%)
2021 2591730
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2021 2591730
100.0%

Most occurring characters

ValueCountFrequency (%)
2 5183460
50.0%
0 2591730
25.0%
1 2591730
25.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 10366920
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 5183460
50.0%
0 2591730
25.0%
1 2591730
25.0%

Most occurring scripts

ValueCountFrequency (%)
Common 10366920
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 5183460
50.0%
0 2591730
25.0%
1 2591730
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10366920
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 5183460
50.0%
0 2591730
25.0%
1 2591730
25.0%
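
The character tables for REF_DATE can be reproduced by hand, because every row holds the same four-character string. The sketch below counts characters with `collections.Counter` and looks up the Unicode general category with `unicodedata.category`; script and block lookups are omitted because the Python standard library does not provide them.

```python
from collections import Counter
import unicodedata

N_ROWS = 2_591_730
VALUE = "2021"  # the single constant value of REF_DATE

# Characters per row, scaled up to the whole column.
per_row = Counter(VALUE)
char_totals = {ch: n * N_ROWS for ch, n in per_row.items()}
total_chars = sum(char_totals.values())

# Share of each character, in percent.
char_shares = {ch: round(100 * n / total_chars, 1) for ch, n in char_totals.items()}

# Unicode general categories present ("Nd" is Decimal Number).
categories = {unicodedata.category(ch) for ch in VALUE}

print(char_totals, total_chars, char_shares, categories)
```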

GEO
Text

Distinct: 174
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 198.2 MiB

Length

Max length: 44
Median length: 35
Mean length: 21.827586
Min length: 5

Characters and Unicode

Total characters: 56571210
Distinct characters: 58
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Canada
2nd row: Canada
3rd row: Canada
4th row: Canada
5th row: Canada
ValueCountFrequency (%)
ca 1742715
20.3%
ont 640485
 
7.5%
cma 640485
 
7.5%
que 476640
 
5.5%
b.c 417060
 
4.9%
alta 253215
 
2.9%
sask 148950
 
1.7%
part 119160
 
1.4%
119160
 
1.4%
n.b 104265
 
1.2%
Other values (216) 3932280
45.8%

Most occurring characters

ValueCountFrequency (%)
6002685
 
10.6%
. 3276900
 
5.8%
C 3187530
 
5.6%
a 3142845
 
5.6%
e 3008790
 
5.3%
A 2785365
 
4.9%
t 2725785
 
4.8%
n 2636415
 
4.7%
( 2502360
 
4.4%
) 2502360
 
4.4%
Other values (48) 24800175
43.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 26989740
47.7%
Uppercase Letter 12422430
22.0%
Space Separator 6002685
 
10.6%
Other Punctuation 5749470
 
10.2%
Open Punctuation 2502360
 
4.4%
Close Punctuation 2502360
 
4.4%
Dash Punctuation 402165
 
0.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3142845
11.6%
e 3008790
11.1%
t 2725785
10.1%
n 2636415
9.8%
r 2159775
 
8.0%
o 2040615
 
7.6%
i 1727820
 
6.4%
l 1608660
 
6.0%
u 1266075
 
4.7%
s 1161810
 
4.3%
Other values (16) 5511150
20.4%
Uppercase Letter
ValueCountFrequency (%)
C 3187530
25.7%
A 2785365
22.4%
M 938385
 
7.6%
O 834120
 
6.7%
B 804330
 
6.5%
S 700065
 
5.6%
Q 625590
 
5.0%
N 476640
 
3.8%
L 268110
 
2.2%
P 253215
 
2.0%
Other values (14) 1549080
12.5%
Other Punctuation
ValueCountFrequency (%)
. 3276900
57.0%
, 2383200
41.5%
/ 59580
 
1.0%
' 29790
 
0.5%
Space Separator
ValueCountFrequency (%)
6002685
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2502360
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2502360
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 402165
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 39412170
69.7%
Common 17159040
30.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 3187530
 
8.1%
a 3142845
 
8.0%
e 3008790
 
7.6%
A 2785365
 
7.1%
t 2725785
 
6.9%
n 2636415
 
6.7%
r 2159775
 
5.5%
o 2040615
 
5.2%
i 1727820
 
4.4%
l 1608660
 
4.1%
Other values (40) 14388570
36.5%
Common
ValueCountFrequency (%)
6002685
35.0%
. 3276900
19.1%
( 2502360
14.6%
) 2502360
14.6%
, 2383200
 
13.9%
- 402165
 
2.3%
/ 59580
 
0.3%
' 29790
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 56496735
99.9%
None 74475
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6002685
 
10.6%
. 3276900
 
5.8%
C 3187530
 
5.6%
a 3142845
 
5.6%
e 3008790
 
5.3%
A 2785365
 
4.9%
t 2725785
 
4.8%
n 2636415
 
4.7%
( 2502360
 
4.4%
) 2502360
 
4.4%
Other values (45) 24725700
43.8%
None
ValueCountFrequency (%)
é 29790
40.0%
è 29790
40.0%
Î 14895
20.0%

DGUID
Text

Distinct: 174
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 170.6 MiB

Length

Max length: 14
Median length: 12
Mean length: 12.028736
Min length: 11

Characters and Unicode

Total characters: 31175235
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021A000011124
2nd row: 2021A000011124
3rd row: 2021A000011124
4th row: 2021A000011124
5th row: 2021A000011124
ValueCountFrequency (%)
2021a000011124 14895
 
0.6%
2021a000212 14895
 
0.6%
2021s0504225 14895
 
0.6%
2021s0504015 14895
 
0.6%
2021s0504011 14895
 
0.6%
2021s0504010 14895
 
0.6%
2021s0503001 14895
 
0.6%
2021a000211 14895
 
0.6%
2021s0504105 14895
 
0.6%
2021s0504110 14895
 
0.6%
Other values (164) 2442780
94.3%

Most occurring characters

ValueCountFrequency (%)
0 9026370
29.0%
2 6121845
19.6%
5 3976965
12.8%
1 3262005
 
10.5%
4 2591730
 
8.3%
S 2383200
 
7.6%
3 1429920
 
4.6%
9 625590
 
2.0%
6 595800
 
1.9%
8 551115
 
1.8%
Other values (2) 610695
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 28583505
91.7%
Uppercase Letter 2591730
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 9026370
31.6%
2 6121845
21.4%
5 3976965
13.9%
1 3262005
 
11.4%
4 2591730
 
9.1%
3 1429920
 
5.0%
9 625590
 
2.2%
6 595800
 
2.1%
8 551115
 
1.9%
7 402165
 
1.4%
Uppercase Letter
ValueCountFrequency (%)
S 2383200
92.0%
A 208530
 
8.0%

Most occurring scripts

ValueCountFrequency (%)
Common 28583505
91.7%
Latin 2591730
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
0 9026370
31.6%
2 6121845
21.4%
5 3976965
13.9%
1 3262005
 
11.4%
4 2591730
 
9.1%
3 1429920
 
5.0%
9 625590
 
2.2%
6 595800
 
2.1%
8 551115
 
1.9%
7 402165
 
1.4%
Latin
ValueCountFrequency (%)
S 2383200
92.0%
A 208530
 
8.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 31175235
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 9026370
29.0%
2 6121845
19.6%
5 3976965
12.8%
1 3262005
 
10.5%
4 2591730
 
8.3%
S 2383200
 
7.6%
3 1429920
 
4.6%
9 625590
 
2.0%
6 595800
 
1.9%
8 551115
 
1.8%
Other values (2) 610695
 
2.0%

Gender (3)
Categorical

UNIFORM 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 160.7 MiB

Total - Gender 863910
Men+ 863910
Women+ 863910

Length

Max length: 14
Median length: 6
Mean length: 8
Min length: 4

Characters and Unicode

Total characters: 20733840
Distinct characters: 16
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Total - Gender
2nd row: Total - Gender
3rd row: Total - Gender
4th row: Total - Gender
5th row: Total - Gender

Common Values

ValueCountFrequency (%)
Total - Gender 863910
33.3%
Men+ 863910
33.3%
Women+ 863910
33.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
total 863910
20.0%
863910
20.0%
gender 863910
20.0%
men 863910
20.0%
women 863910
20.0%

Most occurring characters

ValueCountFrequency (%)
e 3455640
16.7%
n 2591730
12.5%
o 1727820
 
8.3%
1727820
 
8.3%
+ 1727820
 
8.3%
T 863910
 
4.2%
t 863910
 
4.2%
a 863910
 
4.2%
l 863910
 
4.2%
- 863910
 
4.2%
Other values (6) 5183460
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12958650
62.5%
Uppercase Letter 3455640
 
16.7%
Space Separator 1727820
 
8.3%
Math Symbol 1727820
 
8.3%
Dash Punctuation 863910
 
4.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3455640
26.7%
n 2591730
20.0%
o 1727820
13.3%
t 863910
 
6.7%
a 863910
 
6.7%
l 863910
 
6.7%
d 863910
 
6.7%
r 863910
 
6.7%
m 863910
 
6.7%
Uppercase Letter
ValueCountFrequency (%)
T 863910
25.0%
G 863910
25.0%
M 863910
25.0%
W 863910
25.0%
Space Separator
ValueCountFrequency (%)
1727820
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1727820
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 863910
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16414290
79.2%
Common 4319550
 
20.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3455640
21.1%
n 2591730
15.8%
o 1727820
10.5%
T 863910
 
5.3%
t 863910
 
5.3%
a 863910
 
5.3%
l 863910
 
5.3%
G 863910
 
5.3%
d 863910
 
5.3%
r 863910
 
5.3%
Other values (3) 2591730
15.8%
Common
ValueCountFrequency (%)
1727820
40.0%
+ 1727820
40.0%
- 863910
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20733840
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 3455640
16.7%
n 2591730
12.5%
o 1727820
 
8.3%
1727820
 
8.3%
+ 1727820
 
8.3%
T 863910
 
4.2%
t 863910
 
4.2%
a 863910
 
4.2%
l 863910
 
4.2%
- 863910
 
4.2%
Other values (6) 5183460
25.0%

Age (15A)
Categorical

UNIFORM 

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 174.7 MiB

Total - Age 172782
0 to 14 years 172782
0 to 4 years 172782
5 to 9 years 172782
10 to 14 years 172782
Other values (10) 1727820

Length

Max length: 17
Median length: 14
Mean length: 13.666667
Min length: 11

Characters and Unicode

Total characters: 35420310
Distinct characters: 25
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Total - Age
2nd row: Total - Age
3rd row: Total - Age
4th row: Total - Age
5th row: Total - Age

Common Values

ValueCountFrequency (%)
Total - Age 172782
 
6.7%
0 to 14 years 172782
 
6.7%
0 to 4 years 172782
 
6.7%
5 to 9 years 172782
 
6.7%
10 to 14 years 172782
 
6.7%
15 to 24 years 172782
 
6.7%
15 to 19 years 172782
 
6.7%
20 to 24 years 172782
 
6.7%
25 to 64 years 172782
 
6.7%
25 to 34 years 172782
 
6.7%
Other values (5) 863910
33.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
years 2418948
23.7%
to 2246166
22.0%
15 345564
 
3.4%
64 345564
 
3.4%
0 345564
 
3.4%
14 345564
 
3.4%
25 345564
 
3.4%
24 345564
 
3.4%
total 172782
 
1.7%
45 172782
 
1.7%
Other values (18) 3110076
30.5%

Most occurring characters

ValueCountFrequency (%)
7602408
21.5%
e 2764512
 
7.8%
a 2764512
 
7.8%
o 2591730
 
7.3%
r 2591730
 
7.3%
t 2418948
 
6.8%
s 2418948
 
6.8%
y 2418948
 
6.8%
4 2246166
 
6.3%
5 2073384
 
5.9%
Other values (15) 5529024
15.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18833238
53.2%
Decimal Number 8466318
23.9%
Space Separator 7602408
21.5%
Uppercase Letter 345564
 
1.0%
Dash Punctuation 172782
 
0.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2764512
14.7%
a 2764512
14.7%
o 2591730
13.8%
r 2591730
13.8%
t 2418948
12.8%
s 2418948
12.8%
y 2418948
12.8%
n 172782
 
0.9%
d 172782
 
0.9%
g 172782
 
0.9%
Other values (2) 345564
 
1.8%
Decimal Number
ValueCountFrequency (%)
4 2246166
26.5%
5 2073384
24.5%
1 1036692
12.2%
2 863910
 
10.2%
0 691128
 
8.2%
6 518346
 
6.1%
3 345564
 
4.1%
7 345564
 
4.1%
9 345564
 
4.1%
Uppercase Letter
ValueCountFrequency (%)
T 172782
50.0%
A 172782
50.0%
Space Separator
ValueCountFrequency (%)
7602408
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 172782
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 19178802
54.1%
Common 16241508
45.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2764512
14.4%
a 2764512
14.4%
o 2591730
13.5%
r 2591730
13.5%
t 2418948
12.6%
s 2418948
12.6%
y 2418948
12.6%
n 172782
 
0.9%
d 172782
 
0.9%
T 172782
 
0.9%
Other values (4) 691128
 
3.6%
Common
ValueCountFrequency (%)
7602408
46.8%
4 2246166
 
13.8%
5 2073384
 
12.8%
1 1036692
 
6.4%
2 863910
 
5.3%
0 691128
 
4.3%
6 518346
 
3.2%
3 345564
 
2.1%
7 345564
 
2.1%
9 345564
 
2.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 35420310
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7602408
21.5%
e 2764512
 
7.8%
a 2764512
 
7.8%
o 2591730
 
7.3%
r 2591730
 
7.3%
t 2418948
 
6.8%
s 2418948
 
6.8%
y 2418948
 
6.8%
4 2246166
 
6.3%
5 2073384
 
5.9%
Other values (15) 5529024
15.6%

Mother tongue (331)
Text

Distinct: 331
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 177.6 MiB

Length

Max length: 46
Median length: 30
Mean length: 14.661631
Min length: 2

Characters and Unicode

Total characters: 37998990
Distinct characters: 61
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Total - Mother tongue
2nd row: Single responses
3rd row: Official languages
4th row: English
5th row: French
ValueCountFrequency (%)
languages 743850
 
16.1%
n.i.e 266220
 
5.8%
n.o.s 117450
 
2.5%
cree 62640
 
1.4%
german 39150
 
0.8%
english 39150
 
0.8%
creole 39150
 
0.8%
non-official 39150
 
0.8%
sign 31320
 
0.7%
tutchone 31320
 
0.7%
Other values (340) 3210300
69.5%

Most occurring characters

ValueCountFrequency (%)
a 4838940
 
12.7%
n 3734910
 
9.8%
i 2826630
 
7.4%
e 2795310
 
7.4%
2027970
 
5.3%
g 2027970
 
5.3%
s 1855710
 
4.9%
l 1714770
 
4.5%
u 1675620
 
4.4%
o 1464210
 
3.9%
Other values (51) 13036950
34.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 29479950
77.6%
Uppercase Letter 3664440
 
9.6%
Space Separator 2027970
 
5.3%
Other Punctuation 1659960
 
4.4%
Close Punctuation 399330
 
1.1%
Open Punctuation 399330
 
1.1%
Dash Punctuation 368010
 
1.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4838940
16.4%
n 3734910
12.7%
i 2826630
9.6%
e 2795310
9.5%
g 2027970
 
6.9%
s 1855710
 
6.3%
l 1714770
 
5.8%
u 1675620
 
5.7%
o 1464210
 
5.0%
r 1166670
 
4.0%
Other values (18) 5379210
18.2%
Uppercase Letter
ValueCountFrequency (%)
S 469800
12.8%
C 321030
 
8.8%
I 305370
 
8.3%
A 297540
 
8.1%
T 289710
 
7.9%
K 211410
 
5.8%
N 203580
 
5.6%
M 203580
 
5.6%
B 148770
 
4.1%
P 148770
 
4.1%
Other values (16) 1064880
29.1%
Other Punctuation
ValueCountFrequency (%)
. 1143180
68.9%
, 446310
 
26.9%
' 70470
 
4.2%
Space Separator
ValueCountFrequency (%)
2027970
100.0%
Close Punctuation
ValueCountFrequency (%)
) 399330
100.0%
Open Punctuation
ValueCountFrequency (%)
( 399330
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 368010
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 33144390
87.2%
Common 4854600
 
12.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4838940
14.6%
n 3734910
11.3%
i 2826630
 
8.5%
e 2795310
 
8.4%
g 2027970
 
6.1%
s 1855710
 
5.6%
l 1714770
 
5.2%
u 1675620
 
5.1%
o 1464210
 
4.4%
r 1166670
 
3.5%
Other values (44) 9043650
27.3%
Common
ValueCountFrequency (%)
2027970
41.8%
. 1143180
23.5%
, 446310
 
9.2%
) 399330
 
8.2%
( 399330
 
8.2%
- 368010
 
7.6%
' 70470
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 37967670
99.9%
None 31320
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4838940
 
12.7%
n 3734910
 
9.8%
i 2826630
 
7.4%
e 2795310
 
7.4%
2027970
 
5.3%
g 2027970
 
5.3%
s 1855710
 
4.9%
l 1714770
 
4.5%
u 1675620
 
4.4%
o 1464210
 
3.9%
Other values (48) 13005630
34.3%
None
ValueCountFrequency (%)
é 15660
50.0%
É 7830
25.0%
ò 7830
25.0%

Coordinate
Text

UNIQUE 

Distinct: 2591730
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 166.7 MiB

Length

Max length: 12
Median length: 11
Mean length: 10.453026
Min length: 7

Characters and Unicode

Total characters: 27091422
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2591730
Unique (%): 100.0%

Sample

1st row: 1.1.1.1
2nd row: 1.1.1.2
3rd row: 1.1.1.3
4th row: 1.1.1.4
5th row: 1.1.1.5
ValueCountFrequency (%)
1.1.1.1 1
 
< 0.1%
1.1.1.6 1
 
< 0.1%
1.1.1.22 1
 
< 0.1%
1.1.1.20 1
 
< 0.1%
1.1.1.78 1
 
< 0.1%
1.1.1.10 1
 
< 0.1%
1.1.1.3 1
 
< 0.1%
1.1.1.4 1
 
< 0.1%
1.1.1.5 1
 
< 0.1%
1.1.1.7 1
 
< 0.1%
Other values (2591720) 2591720
> 99.9%

Most occurring characters

ValueCountFrequency (%)
. 7775190
28.7%
1 5291721
19.5%
2 3130074
11.6%
3 2534994
 
9.4%
4 1404864
 
5.2%
5 1389969
 
5.1%
6 1217187
 
4.5%
7 1142712
 
4.2%
8 1068237
 
3.9%
9 1068237
 
3.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 19316232
71.3%
Other Punctuation 7775190
28.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 5291721
27.4%
2 3130074
16.2%
3 2534994
13.1%
4 1404864
 
7.3%
5 1389969
 
7.2%
6 1217187
 
6.3%
7 1142712
 
5.9%
8 1068237
 
5.5%
9 1068237
 
5.5%
0 1068237
 
5.5%
Other Punctuation
ValueCountFrequency (%)
. 7775190
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 27091422
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 7775190
28.7%
1 5291721
19.5%
2 3130074
11.6%
3 2534994
 
9.4%
4 1404864
 
5.2%
5 1389969
 
5.1%
6 1217187
 
4.5%
7 1142712
 
4.2%
8 1068237
 
3.9%
9 1068237
 
3.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27091422
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 7775190
28.7%
1 5291721
19.5%
2 3130074
11.6%
3 2534994
 
9.4%
4 1404864
 
5.2%
5 1389969
 
5.1%
6 1217187
 
4.5%
7 1142712
 
4.2%
8 1068237
 
3.9%
9 1068237
 
3.9%

Knowledge of official languages (5):Total - Knowledge of official languages[1]
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 12698
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1067.7934
Minimum: 0
Maximum: 36620955
Zeros: 2090181
Zeros (%): 80.6%
Negative: 0
Negative (%): 0.0%
Memory size: 19.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 145
Maximum: 36620955
Range: 36620955
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 65127.087
Coefficient of variation (CV): 60.992215
Kurtosis: 108736.73
Mean: 1067.7934
Median Absolute Deviation (MAD): 0
Skewness: 270.41049
Sum: 2.7674323 × 10^9
Variance: 4.2415375 × 10^9
Monotonicity: Not monotonic
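
Several of the descriptive statistics above are derived from one another: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the mean is the sum divided by the number of observations. The sketch below checks those relationships for this variable, using the rounded figures as printed; small discrepancies from rounding are expected, so the comparisons use a relative tolerance.

```python
import math

# Rounded figures copied from the statistics for the Total[1] column.
N_OBSERVATIONS = 2_591_730
MEAN = 1067.7934
STD = 65127.087
TOTAL_SUM = 2.7674323e9

cv = STD / MEAN                            # coefficient of variation
variance = STD**2                          # variance from standard deviation
mean_from_sum = TOTAL_SUM / N_OBSERVATIONS # mean recomputed from the sum

print(f"CV = {cv:.6f}")
print(f"variance = {variance:.7e}")
print(f"mean = {mean_from_sum:.4f}")
```

A CV of roughly 61 means the standard deviation is about 61 times the mean, which is consistent with a column that is mostly zeros plus a few very large aggregate totals.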
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 2090181 | 80.6%
5 | 138734 | 5.4%
10 | 56122 | 2.2%
15 | 32646 | 1.3%
20 | 22668 | 0.9%
25 | 17008 | 0.7%
30 | 13478 | 0.5%
35 | 10818 | 0.4%
40 | 8975 | 0.3%
45 | 7717 | 0.3%
Other values (12688) | 193383 | 7.5%

Minimum 10 values

Value | Count | Frequency (%)
0 | 2090181 | 80.6%
5 | 138734 | 5.4%
10 | 56122 | 2.2%
15 | 32646 | 1.3%
20 | 22668 | 0.9%
25 | 17008 | 0.7%
30 | 13478 | 0.5%
35 | 10818 | 0.4%
40 | 8975 | 0.3%
45 | 7717 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
36620955 | 1 | < 0.1%
35145265 | 1 | < 0.1%
27296445 | 1 | < 0.1%
20107200 | 1 | < 0.1%
19646805 | 1 | < 0.1%
18866595 | 1 | < 0.1%
18557080 | 1 | < 0.1%
18063870 | 1 | < 0.1%
17800600 | 1 | < 0.1%
17344660 | 1 | < 0.1%

Knowledge of official languages (5):English only[2]
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 10817
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 749.97096
Minimum: 0
Maximum: 25261655
Zeros: 2157937
Zeros (%): 83.3%
Negative: 0
Negative (%): 0.0%
Memory size: 19.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 85
Maximum: 25261655
Range: 25261655
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 47375.916
Coefficient of variation (CV): 63.170333
Kurtosis: 97219.603
Mean: 749.97096
Median Absolute Deviation (MAD): 0
Skewness: 259.62718
Sum: 1.9437222 × 10^9
Variance: 2.2444774 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 2157937 | 83.3%
5 | 124602 | 4.8%
10 | 49527 | 1.9%
15 | 28415 | 1.1%
20 | 19617 | 0.8%
25 | 14728 | 0.6%
30 | 11541 | 0.4%
35 | 9212 | 0.4%
40 | 7737 | 0.3%
45 | 6681 | 0.3%
Other values (10807) | 161733 | 6.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 2157937 | 83.3%
5 | 124602 | 4.8%
10 | 49527 | 1.9%
15 | 28415 | 1.1%
20 | 19617 | 0.8%
25 | 14728 | 0.6%
30 | 11541 | 0.4%
35 | 9212 | 0.4%
40 | 7737 | 0.3%
45 | 6681 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
25261655 | 1 | < 0.1%
24306165 | 1 | < 0.1%
18325325 | 1 | < 0.1%
18285580 | 1 | < 0.1%
13787630 | 1 | < 0.1%
13248710 | 1 | < 0.1%
12640800 | 1 | < 0.1%
12620855 | 1 | < 0.1%
12196575 | 1 | < 0.1%
12154280 | 1 | < 0.1%

Knowledge of official languages (5):French only[3]
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 3255
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 101.06102
Minimum: 0
Maximum: 4087895
Zeros: 2513182
Zeros (%): 97.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 4087895
Range: 4087895
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 10027.686
Coefficient of variation (CV): 99.224074
Kurtosis: 79968.312
Mean: 101.06102
Median Absolute Deviation (MAD): 0
Skewness: 249.26977
Sum: 2.6192287 × 10^8
Variance: 1.0055448 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 2513182 | 97.0%
5 | 27848 | 1.1%
10 | 9482 | 0.4%
15 | 5053 | 0.2%
20 | 3340 | 0.1%
25 | 2419 | 0.1%
30 | 1964 | 0.1%
35 | 1538 | 0.1%
40 | 1223 | < 0.1%
45 | 1059 | < 0.1%
Other values (3245) | 24622 | 1.0%

Minimum 10 values

Value | Count | Frequency (%)
0 | 2513182 | 97.0%
5 | 27848 | 1.1%
10 | 9482 | 0.4%
15 | 5053 | 0.2%
20 | 3340 | 0.1%
25 | 2419 | 0.1%
30 | 1964 | 0.1%
35 | 1538 | 0.1%
40 | 1223 | < 0.1%
45 | 1059 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
4087895 | 1 | < 0.1%
4029960 | 1 | < 0.1%
3980275 | 1 | < 0.1%
3925600 | 1 | < 0.1%
3734010 | 1 | < 0.1%
3728020 | 1 | < 0.1%
3638955 | 1 | < 0.1%
3633980 | 1 | < 0.1%
2173390 | 1 | < 0.1%
2142665 | 1 | < 0.1%

Knowledge of official languages (5):English and French[4]
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 5473
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 188.13551
Minimum: 0
Maximum: 6581680
Zeros: 2361395
Zeros (%): 91.1%
Negative: 0
Negative (%): 0.0%
Memory size: 19.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 10
Maximum: 6581680
Range: 6581680
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 13140.444
Coefficient of variation (CV): 69.845635
Kurtosis: 78425.758
Mean: 188.13551
Median Absolute Deviation (MAD): 0
Skewness: 235.20148
Sum: 4.8759644 × 10^8
Variance: 1.7267127 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 2361395 | 91.1%
5 | 74834 | 2.9%
10 | 26072 | 1.0%
15 | 14684 | 0.6%
20 | 10096 | 0.4%
25 | 7504 | 0.3%
30 | 5904 | 0.2%
35 | 4752 | 0.2%
40 | 3829 | 0.1%
45 | 3483 | 0.1%
Other values (5463) | 79177 | 3.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 2361395 | 91.1%
5 | 74834 | 2.9%
10 | 26072 | 1.0%
15 | 14684 | 0.6%
20 | 10096 | 0.4%
25 | 7504 | 0.3%
30 | 5904 | 0.2%
35 | 4752 | 0.2%
40 | 3829 | 0.1%
45 | 3483 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
6581680 | 1 | < 0.1%
6130560 | 1 | < 0.1%
5226490 | 1 | < 0.1%
3898980 | 1 | < 0.1%
3766750 | 1 | < 0.1%
3675005 | 1 | < 0.1%
3554305 | 1 | < 0.1%
3419880 | 1 | < 0.1%
3337330 | 1 | < 0.1%
3244350 | 1 | < 0.1%

Knowledge of official languages (5):Neither English nor French[5]
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 2600
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.55237
Minimum: 0
Maximum: 689725
Zeros: 2451876
Zeros (%): 94.6%
Negative: 0
Negative (%): 0.0%
Memory size: 19.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 5
Maximum: 689725
Range: 689725
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1603.2533
Coefficient of variation (CV): 56.151322
Kurtosis: 63647.582
Mean: 28.55237
Median Absolute Deviation (MAD): 0
Skewness: 204.02385
Sum: 74000035
Variance: 2570421.3
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 2451876 | 94.6%
5 | 52148 | 2.0%
10 | 18248 | 0.7%
15 | 9934 | 0.4%
20 | 6899 | 0.3%
25 | 4881 | 0.2%
30 | 3806 | 0.1%
35 | 2944 | 0.1%
40 | 2503 | 0.1%
45 | 2052 | 0.1%
Other values (2590) | 36439 | 1.4%

Minimum 10 values

Value | Count | Frequency (%)
0 | 2451876 | 94.6%
5 | 52148 | 2.0%
10 | 18248 | 0.7%
15 | 9934 | 0.4%
20 | 6899 | 0.3%
25 | 4881 | 0.2%
30 | 3806 | 0.1%
35 | 2944 | 0.1%
40 | 2503 | 0.1%
45 | 2052 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
689725 | 1 | < 0.1%
678580 | 1 | < 0.1%
667955 | 1 | < 0.1%
662420 | 1 | < 0.1%
405555 | 1 | < 0.1%
399290 | 1 | < 0.1%
394210 | 1 | < 0.1%
391515 | 1 | < 0.1%
344545 | 1 | < 0.1%
339275 | 1 | < 0.1%

Interactions

[Figure: 25 pairwise interaction plots of the numeric variables; images not preserved.]

Correlations

Columns [1] to [5] are the Knowledge of official languages (5) variables: [1] Total - Knowledge of official languages, [2] English only, [3] French only, [4] English and French, [5] Neither English nor French.

Variable | [1] | [2] | [3] | [4] | [5] | Gender (3) | Age (15A)
[1] Total - Knowledge of official languages | 1.000 | 0.924 | 0.398 | 0.690 | 0.543 | 0.002 | 0.005
[2] English only | 0.924 | 1.000 | 0.268 | 0.597 | 0.536 | 0.002 | 0.005
[3] French only | 0.398 | 0.268 | 1.000 | 0.525 | 0.409 | 0.003 | 0.005
[4] English and French | 0.690 | 0.597 | 0.525 | 1.000 | 0.542 | 0.001 | 0.005
[5] Neither English nor French | 0.543 | 0.536 | 0.409 | 0.542 | 1.000 | 0.004 | 0.007
Gender (3) | 0.002 | 0.002 | 0.003 | 0.001 | 0.004 | 1.000 | 0.000
Age (15A) | 0.005 | 0.005 | 0.005 | 0.005 | 0.007 | 0.000 | 1.000
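
The High correlation alerts in the Overview can be related back to this matrix. The sketch below lists, for each numeric variable, the other numeric variables whose coefficient meets a cut-off. The cut-off of 0.5 is an assumption: the report text does not state it, but it is the value that reproduces the "and N other fields" counts in the alerts. The shortened labels and the function name are illustrative.

```python
# Coefficients among the five numeric variables, copied from the matrix.
LABELS = [
    "Total",
    "English only",
    "French only",
    "English and French",
    "Neither English nor French",
]
MATRIX = [
    [1.000, 0.924, 0.398, 0.690, 0.543],
    [0.924, 1.000, 0.268, 0.597, 0.536],
    [0.398, 0.268, 1.000, 0.525, 0.409],
    [0.690, 0.597, 0.525, 1.000, 0.542],
    [0.543, 0.536, 0.409, 0.542, 1.000],
]
THRESHOLD = 0.5  # assumed cut-off, not stated in the report


def highly_correlated(labels, matrix, threshold):
    """Map each variable to the other variables at or above the threshold."""
    return {
        label: [
            labels[j]
            for j, value in enumerate(row)
            if j != i and value >= threshold
        ]
        for i, (label, row) in enumerate(zip(labels, matrix))
    }


partners = highly_correlated(LABELS, MATRIX, THRESHOLD)
for name, others in partners.items():
    print(f"{name}: {len(others)} highly correlated field(s): {others}")
```

Under that assumption, French only is flagged against English and French alone, while English and French is flagged against all four other numeric columns, matching the alert wording.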

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

The last five columns are the Knowledge of official languages (5) variables: Total [1], English only [2], French only [3], English and French [4], Neither English nor French [5].

First rows

Row | REF_DATE | GEO | DGUID | Gender (3) | Age (15A) | Mother tongue (331) | Coordinate | Total [1] | English only [2] | French only [3] | English and French [4] | Neither English nor French [5]
0 | 2021 | Canada | 2021A000011124 | Total - Gender | Total - Age | Total - Mother tongue | 1.1.1.1 | 36620955 | 25261655 | 4087895 | 6581680 | 689725
1 | 2021 | Canada | 2021A000011124 | Total - Gender | Total - Age | Single responses | 1.1.1.2 | 35145265 | 24306165 | 4029960 | 6130560 | 678580
2 | 2021 | Canada | 2021A000011124 | Total - Gender | Total - Age | Official languages | 1.1.1.3 | 27296445 | 18325325 | 3734010 | 5226490 | 10620
3 | 2021 | Canada | 2021A000011124 | Total - Gender | Total - Age | English | 1.1.1.4 | 20107200 | 18285580 | 5990 | 1806605 | 9025
4 | 2021 | Canada | 2021A000011124 | Total - Gender | Total - Age | French | 1.1.1.5 | 7189245 | 39740 | 3728020 | 3419880 | 1595
5 | 2021 | Canada | 2021A000011124 | Total - Gender | Total - Age | Non-official languages | 1.1.1.6 | 7848820 | 5980845 | 295950 | 904065 | 667955
6 | 2021 | Canada | 2021A000011124 | Total - Gender | Total - Age | Indigenous languages | 1.1.1.7 | 148895 | 123580 | 10995 | 8785 | 5535
7 | 2021 | Canada | 2021A000011124 | Total - Gender | Total - Age | Algonquian languages | 1.1.1.8 | 97125 | 79020 | 10730 | 5625 | 1760
8 | 2021 | Canada | 2021A000011124 | Total - Gender | Total - Age | Blackfoot | 1.1.1.9 | 2520 | 2480 | 0 | 25 | 10
9 | 2021 | Canada | 2021A000011124 | Total - Gender | Total - Age | Cree-Innu languages | 1.1.1.10 | 67665 | 51030 | 10405 | 4780 | 1455

Last rows

Row | REF_DATE | GEO | DGUID | Gender (3) | Age (15A) | Mother tongue (331) | Coordinate | Total [1] | English only [2] | French only [3] | English and French [4] | Neither English nor French [5]
2591720 | 2021 | Nunavut | 2021A000262 | Women+ | 75 years and over | Estonian | 174.3.15.322 | 0 | 0 | 0 | 0 | 0
2591721 | 2021 | Nunavut | 2021A000262 | Women+ | 75 years and over | Finnish | 174.3.15.323 | 0 | 0 | 0 | 0 | 0
2591722 | 2021 | Nunavut | 2021A000262 | Women+ | 75 years and over | Hungarian | 174.3.15.324 | 0 | 0 | 0 | 0 | 0
2591723 | 2021 | Nunavut | 2021A000262 | Women+ | 75 years and over | Other languages, n.i.e. | 174.3.15.325 | 0 | 0 | 0 | 0 | 0
2591724 | 2021 | Nunavut | 2021A000262 | Women+ | 75 years and over | Multiple responses | 174.3.15.326 | 15 | 15 | 0 | 0 | 0
2591725 | 2021 | Nunavut | 2021A000262 | Women+ | 75 years and over | English and French | 174.3.15.327 | 0 | 0 | 0 | 0 | 0
2591726 | 2021 | Nunavut | 2021A000262 | Women+ | 75 years and over | English and non-official language(s) | 174.3.15.328 | 15 | 15 | 0 | 0 | 0
2591727 | 2021 | Nunavut | 2021A000262 | Women+ | 75 years and over | French and non-official language(s) | 174.3.15.329 | 0 | 0 | 0 | 0 | 0
2591728 | 2021 | Nunavut | 2021A000262 | Women+ | 75 years and over | English, French and non-official language(s) | 174.3.15.330 | 0 | 0 | 0 | 0 | 0
2591729 | 2021 | Nunavut | 2021A000262 | Women+ | 75 years and over | Multiple non-official languages | 174.3.15.331 | 0 | 0 | 0 | 0 | 0
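
The sample rows suggest that Coordinate encodes a position in a full grid of GEO (174) × Gender (3) × Age (15) × Mother tongue (331) categories, which multiplies out to exactly the 2591730 observations in the report. The sketch below converts a Coordinate string to a zero-based row number. It assumes row-major order with Mother tongue varying fastest; that ordering is inferred from the first and last sample rows and is not stated by the report. The names `DIMENSIONS` and `coordinate_to_row` are illustrative.

```python
import math

# Distinct counts for GEO, Gender (3), Age (15A), Mother tongue (331).
DIMENSIONS = (174, 3, 15, 331)


def coordinate_to_row(coordinate: str, dims=DIMENSIONS) -> int:
    """Convert a 1-based dotted coordinate to a 0-based row index.

    Assumes row-major ordering (last dimension varies fastest).
    """
    parts = [int(p) for p in coordinate.split(".")]
    if len(parts) != len(dims):
        raise ValueError(f"expected {len(dims)} components, got {len(parts)}")
    index = 0
    for part, size in zip(parts, dims):
        if not 1 <= part <= size:
            raise ValueError(f"component {part} outside 1..{size}")
        index = index * size + (part - 1)
    return index


grid_size = math.prod(DIMENSIONS)
print(grid_size)
print(coordinate_to_row("1.1.1.1"))
print(coordinate_to_row("174.3.15.331"))
```

A complete grid of this kind also explains why Gender (3) and Age (15A) are flagged as uniformly distributed and why Coordinate is unique: every combination of categories appears exactly once.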